Use the Climate Data Catalog#

Once the catalog has been generated in the other notebook, we can use it here to load and plot the data.

Imports#

import intake
from distributed import Client, LocalCluster
import hvplot.xarray
import matplotlib.pyplot as plt
import holoviews as hv
hv.extension("bokeh")

Spin up a Dask Cluster#

cluster = LocalCluster()
client = Client(cluster)
client
2022-05-12 19:17:34,732 - distributed.diskutils - INFO - Found stale lock file and directory '/Users/mgrover/git_repos/cloud-for-climate/notebooks/dask-worker-space/worker-ihnwtxn5', purging
2022-05-12 19:17:34,736 - distributed.diskutils - INFO - Found stale lock file and directory '/Users/mgrover/git_repos/cloud-for-climate/notebooks/dask-worker-space/worker-rpinhfj7', purging
2022-05-12 19:17:34,737 - distributed.diskutils - INFO - Found stale lock file and directory '/Users/mgrover/git_repos/cloud-for-climate/notebooks/dask-worker-space/worker-k9cqvfin', purging
2022-05-12 19:17:34,738 - distributed.diskutils - INFO - Found stale lock file and directory '/Users/mgrover/git_repos/cloud-for-climate/notebooks/dask-worker-space/worker-l3mrmt5k', purging

Client

Client-16c33222-d252-11ec-9b4a-acde48001122

Connection method: Cluster object
Cluster type: distributed.LocalCluster
Dashboard: http://127.0.0.1:8787/status
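The "stale lock file" messages in the log usually mean that worker directories from an earlier cluster were left behind. One way to avoid that is to close the cluster explicitly when the work is done. The sketch below is a minimal example of that pattern and not the notebook's own setup: the helper name run_on_local_cluster is hypothetical, and processes=False and dashboard_address=":0" are choices made for the example.

```python
from distributed import Client, LocalCluster


def run_on_local_cluster(func, *args, n_workers=2, threads_per_worker=1):
    """Run func(*args) on a temporary local cluster and return the result."""
    # processes=False keeps the workers as threads inside this process (an
    # assumption for this sketch; the notebook's LocalCluster() uses defaults).
    # dashboard_address=":0" asks for a random free port instead of 8787.
    # Both context managers close on exit, which shuts the workers down.
    with LocalCluster(
        n_workers=n_workers,
        threads_per_worker=threads_per_worker,
        processes=False,
        dashboard_address=":0",
    ) as cluster:
        with Client(cluster) as client:
            future = client.submit(func, *args)
            return future.result()


total = run_on_local_cluster(sum, [1, 2, 3])
print(total)
```

In an interactive notebook it is often more convenient to keep the client open, as above, and call client.close() and cluster.close() in the last cell.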

Access the data#

The intake catalog is a YAML file. We open it and look up the dataset entry by name:

data_catalog = intake.open_catalog('catalogs/test-catalog.yml')
data_catalog["cesm-test-dataset"]
cesm-test-dataset:
  args:
    storage_options:
      fo: merged-data.json
      remote_options:
        token: anon
      remote_protocol: s3
    urlpath: reference://
  description: CESM Test Dataset
  driver: intake_xarray.xzarr.ZarrSource
  metadata:
    catalog_dir: /Users/mgrover/git_repos/cloud-for-climate/notebooks/catalogs/
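To make the entry above less opaque, the sketch below rebuilds the same arguments by hand. It assumes that the ZarrSource driver hands urlpath and storage_options to xarray's zarr backend, and that xarray, zarr, fsspec, and s3fs are installed. The helper names are hypothetical, and the option values are copied from the catalog entry rather than verified independently.

```python
def reference_storage_options(reference_file, remote_protocol="s3",
                              remote_options=None):
    """Build fsspec options for a reference:// filesystem."""
    if remote_options is None:
        # Value copied from the catalog entry shown above.
        remote_options = {"token": "anon"}
    return {
        "fo": reference_file,                # JSON file of chunk references
        "remote_protocol": remote_protocol,  # where the referenced bytes live
        "remote_options": dict(remote_options),
    }


def open_reference_dataset(reference_file):
    """Open the referenced data lazily as a Dask-backed xarray.Dataset."""
    import xarray as xr  # imported lazily so the helper above has no dependencies

    return xr.open_dataset(
        "reference://",
        engine="zarr",
        chunks={},  # keep the on-disk chunking and return Dask arrays
        backend_kwargs={
            "consolidated": False,
            "storage_options": reference_storage_options(reference_file),
        },
    )


print(reference_storage_options("merged-data.json"))
```

Calling open_reference_dataset("merged-data.json") would need the reference file and network access to the bucket it points at, so it is left uncalled here.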

Load the Data Using Dask#

ds = data_catalog["cesm-test-dataset"].to_dask()
ds
<xarray.Dataset>
Dimensions:                  (time: 12, lat: 192, ilev: 71, lev: 70, lon: 288,
                              nbnd: 2, zlon: 1)
Coordinates:
  * ilev                     (ilev) float64 4.5e-06 7.42e-06 ... 985.1 1e+03
  * lat                      (lat) float64 -90.0 -89.06 -88.12 ... 89.06 90.0
  * lev                      (lev) float64 5.96e-06 9.827e-06 ... 976.3 992.6
  * lon                      (lon) float64 0.0 1.25 2.5 ... 356.2 357.5 358.8
  * time                     (time) object 2035-02-01 00:00:00 ... 2036-01-01...
  * zlon                     (zlon) float64 0.0
Dimensions without coordinates: nbnd
Data variables: (12/36)
    P0                       (time) float64 dask.array<chunksize=(1,), meta=np.ndarray>
    ch4vmr                   (time) float64 dask.array<chunksize=(12,), meta=np.ndarray>
    co2vmr                   (time) float64 dask.array<chunksize=(12,), meta=np.ndarray>
    date                     (time) float64 dask.array<chunksize=(12,), meta=np.ndarray>
    date_written             (time) object dask.array<chunksize=(1,), meta=np.ndarray>
    datesec                  (time) float64 dask.array<chunksize=(12,), meta=np.ndarray>
    ...                       ...
    sol_tsi                  (time) float64 dask.array<chunksize=(12,), meta=np.ndarray>
    time_bnds                (time, nbnd) object dask.array<chunksize=(1, 2), meta=np.ndarray>
    time_written             (time) object dask.array<chunksize=(1,), meta=np.ndarray>
    wet_deposition_NHx_as_N  (time, lat, lon) float32 dask.array<chunksize=(1, 192, 288), meta=np.ndarray>
    wet_deposition_NOy_as_N  (time, lat, lon) float32 dask.array<chunksize=(1, 192, 288), meta=np.ndarray>
    zlon_bnds                (time, zlon, nbnd) float64 dask.array<chunksize=(1, 1, 2), meta=np.ndarray>
Attributes:
    Conventions:       CF-1.0
    case:              b.e21.BW.f09_g17.SSP245-TSMLT-GAUSS-LOWER-0.5.001
    host:               
    initial_file:      b.e21.BWSSP245cmip6.f09_g17.CMIP6-SSP2-4.5-WACCM.001.c...
    logname:           geostrat
    model_doi_url:     https://doi.org/10.5065/D67H1H0V
    source:            CAM
    time_period_freq:  month_1
    topography_file:   /scratch/geostrat/inputdata/atm/cam/topo/fv_0.9x1.25_n...
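The repr shows that wet_deposition_NHx_as_N is float32 with shape (time: 12, lat: 192, lon: 288) and chunks of shape (1, 192, 288), that is, one chunk per time step. The arithmetic below estimates how much memory one chunk and the whole variable take. The helper name nbytes is hypothetical.

```python
from math import prod


def nbytes(shape, itemsize):
    """Bytes needed for an array of the given shape and element size."""
    return prod(shape) * itemsize


FLOAT32_BYTES = 4

chunk_bytes = nbytes((1, 192, 288), FLOAT32_BYTES)   # one time step
total_bytes = nbytes((12, 192, 288), FLOAT32_BYTES)  # all 12 time steps
n_chunks = total_bytes // chunk_bytes

print(f"one chunk:  {chunk_bytes / 1e6:.2f} MB")
print(f"whole array: {total_bytes / 1e6:.2f} MB in {n_chunks} chunks")
```

On the loaded dataset, ds.wet_deposition_NHx_as_N.nbytes and ds.wet_deposition_NHx_as_N.chunks should report the same numbers without reading any data.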

Investigate our Dataset#

Let’s investigate our dataset!

Plot Using Matplotlib#

We can start by plotting a single time step:

ds.wet_deposition_NHx_as_N.isel(time=0).plot();
_images/use-the-catalog_13_0.png

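And the time series at a single point. The dataset's lon coordinate runs from 0 to 360 degrees east, so the western longitude is wrapped into that range before the nearest-neighbor lookup:

ds.wet_deposition_NHx_as_N.sel(lat=41.8781,
                               lon=-87.6298 % 360,  # -87.6298 wraps to 272.3702
                               method='nearest').plot()
plt.title('NHx Wet Deposition near Chicago, IL')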
Text(0.5, 1.0, 'NHx Wet Deposition near Chicago, IL')
_images/use-the-catalog_15_1.png
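Because the lon coordinate of this dataset runs from 0 to 358.75 in steps of 1.25 degrees, a western longitude such as Chicago's -87.6298 corresponds to 272.3702 on this grid. The sketch below shows the conversion and what a nearest-neighbor lookup picks with and without it. It uses plain Python lists in place of the xarray coordinate, and the helper names are hypothetical.

```python
def to_0_360(lon_deg):
    """Wrap a longitude in degrees east into the range [0, 360)."""
    return lon_deg % 360.0


def nearest_index(values, target):
    """Index of the value closest to target (no wraparound handling)."""
    return min(range(len(values)), key=lambda i: abs(values[i] - target))


# Same longitudes as the dataset: 288 points, 0.0 to 358.75.
grid_lon = [i * 1.25 for i in range(288)]

chicago_lon = to_0_360(-87.6298)
idx = nearest_index(grid_lon, chicago_lon)
naive_idx = nearest_index(grid_lon, -87.6298)

print(f"converted: lon={chicago_lon:.4f} -> grid point {grid_lon[idx]}")
print(f"unconverted: lon=-87.6298 -> grid point {grid_lon[naive_idx]}")
```

Without the conversion, the nearest grid point to -87.6298 is the first one, at 0 degrees, which is nowhere near Chicago.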

Plot Using hvPlot#

Let’s make the same plots with hvPlot, an interactive plotting library.

We can start with a single time step:

ds.wet_deposition_NHx_as_N.isel(time=0).hvplot(cmap='reds')

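And the time series at a single point, again wrapping the western longitude into the dataset's 0 to 360 range:

ds.wet_deposition_NHx_as_N.sel(lat=41.8781,
                               lon=-87.6298 % 360,  # -87.6298 wraps to 272.3702
                               method='nearest').hvplot.line(title='NHx Wet Deposition near Chicago, IL')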
WARNING:param.CurvePlot02951: Converting cftime.datetime from a non-standard calendar (noleap) to a standard calendar for plotting. This may lead to subtle errors in formatting dates, for accurate tick formatting switch to the matplotlib backend.